Overview

Dataset statistics

Number of variables: 20
Number of observations: 338592
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 51.7 MiB
Average record size in memory: 160.0 B
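
The memory figures in this overview are mutually consistent if every column is stored as an 8-byte value. That storage width is an assumption (the report does not list dtypes); under it, the arithmetic can be checked with a short sketch:

```python
# Assumption: each of the 20 columns uses an 8-byte dtype (for example int64
# or float64). The report itself does not state the dtypes.
N_ROWS = 338_592
N_COLS = 20
BYTES_PER_VALUE = 8  # assumed, not taken from the report

record_bytes = N_COLS * BYTES_PER_VALUE        # bytes needed for one row
total_mib = N_ROWS * record_bytes / 2**20      # whole table, in MiB
column_mib = N_ROWS * BYTES_PER_VALUE / 2**20  # a single column, in MiB

print(f"record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
print(f"per-column size: {column_mib:.1f} MiB")
```

Under this assumption the results round to the reported 160.0 B per record, 51.7 MiB in total, and the 2.6 MiB "Memory size" listed for each variable.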

Variable types

Numeric: 10
Categorical: 10

Warnings

HenkanFlag1 is highly correlated with HenkanFlag2 and 6 other fields [High correlation]
HenkanFlag2 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag3 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag4 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag5 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag7 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag8 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag9 is highly correlated with HenkanFlag1 and 6 other fields [High correlation]
HenkanFlag3 is highly correlated with HenkanFlag2 and 6 other fields [High correlation]
HenkanFlag2 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag4 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag9 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag8 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag5 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag1 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
HenkanFlag7 is highly correlated with HenkanFlag3 and 6 other fields [High correlation]
Fukasyokin is highly skewed (γ1 = 72.86686313) [Skewed]
KettoNum2 is highly skewed (γ1 = 29.80015892) [Skewed]
Honsyokin has 224209 (66.2%) zeros [Zeros]
Fukasyokin has 319456 (94.3%) zeros [Zeros]
KettoNum2 has 338212 (99.9%) zeros [Zeros]
TimeDiff has 13592 (4.0%) zeros [Zeros]
DMGosaP has 90598 (26.8%) zeros [Zeros]
DMGosaM has 89136 (26.3%) zeros [Zeros]
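
The Skewed and Zeros warnings rest on two simple quantities: the skewness γ1 and the share of zero entries. The sketch below computes both on a small made-up sample. It uses the plain moment estimator m3 / m2^1.5, which is an assumption about the estimator: pandas' `Series.skew` applies a bias correction, so results differ slightly on small samples.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical sample: mostly zeros with one large value, which gives a
# strongly right-skewed distribution like the prize-money columns here.
sample = [0, 0, 0, 0, 0, 0, 0, 0, 0, 100]
print(skewness(sample), zero_share(sample))

# The reported zero percentage for Honsyokin follows from the counts above.
honsyokin_zero_pct = round(100 * 224209 / 338592, 1)
print(honsyokin_zero_pct)
```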

Reproduction

Analysis started: 2021-04-07 12:50:28.994133
Analysis finished: 2021-04-07 12:52:00.415928
Duration: 1 minute and 31.42 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
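
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the script that produced this report: the function name `build_report`, the title, and the output path are hypothetical, and it assumes a pandas-profiling 2.x installation whose `ProfileReport` accepts a `config_file` argument.

```python
def build_report(df, output_path="report.html", config_file=None):
    """Profile a pandas DataFrame and write the result as an HTML report."""
    # Imported lazily so that defining this function does not require the
    # package to be installed.
    from pandas_profiling import ProfileReport

    kwargs = {"title": "Profiling report"}  # hypothetical title
    if config_file is not None:
        # For example the config.yaml offered for download by the report.
        kwargs["config_file"] = config_file
    profile = ProfileReport(df, **kwargs)
    profile.to_file(output_path)
    return profile
```

Calling `build_report(df, config_file="config.yaml")` with the original DataFrame should give comparable statistics, although timings and layout depend on the installed version.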

Variables

Honsyokin
Real number (ℝ≥0)

ZEROS

Distinct: 326
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14268.05991
Minimum: 0
Maximum: 3000000
Zeros: 224209
Zeros (%): 66.2%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 11000
95-th percentile: 72000
Maximum: 3000000
Range: 3000000
Interquartile range (IQR): 11000

Descriptive statistics

Standard deviation: 50759.18818
Coefficient of variation (CV): 3.557539603
Kurtosis: 617.1685444
Mean: 14268.05991
Median Absolute Deviation (MAD): 0
Skewness: 17.45624686
Sum: 4831050940
Variance: 2576495185
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     224209  66.2%
7500                  7767    2.3%
11000                 7244    2.1%
20000                 6736    2.0%
13000                 6468    1.9%
5000                  5660    1.7%
50000                 5572    1.6%
18000                 4092    1.2%
19000                 3766    1.1%
30000                 3531    1.0%
Other values (316)    63547   18.8%

Minimum 5 values

Value                 Count   Frequency (%)
0                     224209  66.2%
2000                  1       < 0.1%
2300                  1       < 0.1%
2400                  3       < 0.1%
2500                  25      < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
3000000               9       < 0.1%
2500000               5       < 0.1%
2000000               8       < 0.1%
1800000               1       < 0.1%
1500000               18      < 0.1%
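
Several of the descriptive statistics reported for Honsyokin are derived from others: the coefficient of variation is the standard deviation divided by the mean, the IQR is Q3 minus Q1, and the variance is the squared standard deviation. The sketch below re-derives them from the reported figures; the variable names are illustrative only.

```python
# Figures copied from the Honsyokin summary in this report.
mean = 14268.05991
std = 50759.18818
q1, q3 = 0, 11000
minimum, maximum = 0, 3_000_000

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance as squared standard deviation

print(f"CV={cv:.4f} IQR={iqr} range={value_range} variance={variance:.0f}")
```

The same relationships hold for the other numeric variables, so they offer a quick consistency check on any row of the report.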

Fukasyokin
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 516
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 171.484087
Minimum: 0
Maximum: 411110
Zeros: 319456
Zeros (%): 94.3%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 430
Maximum: 411110
Range: 411110
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3466.838772
Coefficient of variation (CV): 20.21667918
Kurtosis: 6181.855096
Mean: 171.484087
Median Absolute Deviation (MAD): 0
Skewness: 72.86686313
Sum: 58063140
Variance: 12018971.07
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     319456  94.3%
500                   300     0.1%
1000                  277     0.1%
3500                  266     0.1%
520                   255     0.1%
3570                  252     0.1%
3430                  248     0.1%
980                   247     0.1%
1020                  245     0.1%
490                   243     0.1%
Other values (506)    16803   5.0%

Minimum 5 values

Value                 Count   Frequency (%)
0                     319456  94.3%
130                   2       < 0.1%
160                   5       < 0.1%
170                   6       < 0.1%
180                   3       < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
411110                1       < 0.1%
378070                1       < 0.1%
373100                1       < 0.1%
341040                1       < 0.1%
326690                1       < 0.1%

HaronTimeL3_x
Real number (ℝ≥0)

Distinct: 287
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 369.2291253
Minimum: 128
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 128
5-th percentile: 336
Q1: 353
median: 369
Q3: 385
95-th percentile: 411
Maximum: 999
Range: 871
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 75.93568416
Coefficient of variation (CV): 0.2056600603
Kurtosis: 41.88949404
Mean: 369.2291253
Median Absolute Deviation (MAD): 16
Skewness: 4.136115284
Sum: 125018028
Variance: 5766.228129
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
355                   5550    1.6%
373                   5550    1.6%
359                   5528    1.6%
353                   5522    1.6%
370                   5515    1.6%
361                   5507    1.6%
356                   5502    1.6%
362                   5500    1.6%
354                   5493    1.6%
358                   5472    1.6%
Other values (277)    283453  83.7%

Minimum 5 values

Value                 Count   Frequency (%)
128                   3       < 0.1%
129                   26      < 0.1%
130                   106     < 0.1%
131                   288     0.1%
132                   605     0.2%

Maximum 5 values

Value                 Count   Frequency (%)
999                   2999    0.9%
998                   1       < 0.1%
896                   1       < 0.1%
845                   1       < 0.1%
829                   1       < 0.1%

KettoNum1
Real number (ℝ≥0)

Distinct: 19716
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011865713
Minimum: 2001100686
Maximum: 2018110145
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 2001100686
5-th percentile: 2006105022
Q1: 2009103182
median: 2012103370
Q3: 2015101164
95-th percentile: 2017105089
Maximum: 2018110145
Range: 17009459
Interquartile range (IQR): 5997982

Descriptive statistics

Standard deviation: 3478532.098
Coefficient of variation (CV): 0.001729008092
Kurtosis: -0.9857842664
Mean: 2011865713
Median Absolute Deviation (MAD): 2998903.5
Skewness: -0.1079470544
Sum: 6.812016356 × 10^14
Variance: 1.210018556 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
2008103552            147     < 0.1%
2015104961            132     < 0.1%
2013100612            130     < 0.1%
2011101125            129     < 0.1%
2011104416            119     < 0.1%
2009106253            115     < 0.1%
2013103618            114     < 0.1%
2012102013            112     < 0.1%
2009102739            109     < 0.1%
2014103547            104     < 0.1%
Other values (19706)  337381  99.6%

Minimum 5 values

Value                 Count   Frequency (%)
2001100686            1       < 0.1%
2001102414            8       < 0.1%
2001103061            1       < 0.1%
2001103413            1       < 0.1%
2001106201            3       < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
2018110145            14      < 0.1%
2018110138            15      < 0.1%
2018110132            1       < 0.1%
2018110131            12      < 0.1%
2018110129            15      < 0.1%

KettoNum2
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 82
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2258764.339
Minimum: 0
Maximum: 2018105310
Zeros: 338212
Zeros (%): 99.9%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 2018105310
Range: 2018105310
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 67386825.26
Coefficient of variation (CV): 29.83349085
Kurtosis: 886.0576757
Mean: 2258764.339
Median Absolute Deviation (MAD): 0
Skewness: 29.80015892
Sum: 7.647995352 × 10^11
Variance: 4.540984218 × 10^15
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     338212  99.9%
2015100494            15      < 0.1%
2007102958            15      < 0.1%
2007105741            14      < 0.1%
2016105009            14      < 0.1%
2017106717            13      < 0.1%
2016104603            13      < 0.1%
2015105868            13      < 0.1%
2007102760            13      < 0.1%
2008104822            12      < 0.1%
Other values (72)     258     0.1%

Minimum 5 values

Value                 Count   Frequency (%)
0                     338212  99.9%
2006100405            12      < 0.1%
2006101135            1       < 0.1%
2006106787            10      < 0.1%
2006110101            1       < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
2018105310            1       < 0.1%
2018104716            7       < 0.1%
2017110083            1       < 0.1%
2017106717            13      < 0.1%
2017106255            1       < 0.1%

TimeDiff
Real number (ℝ)

ZEROS

Distinct: 292
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102.6799186
Minimum: -56
Maximum: 9999
Zeros: 13592
Zeros (%): 4.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: -56
5-th percentile: 0
Q1: 5
median: 11
Q3: 20
95-th percentile: 43
Maximum: 9999
Range: 10055
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 935.3433745
Coefficient of variation (CV): 9.109311608
Kurtosis: 107.9251256
Mean: 102.6799186
Median Absolute Deviation (MAD): 7
Skewness: 10.48298533
Sum: 34766599
Variance: 874867.2282
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
6                     14747   4.4%
7                     14626   4.3%
8                     14385   4.2%
5                     14311   4.2%
4                     14094   4.2%
9                     13755   4.1%
0                     13592   4.0%
10                    13400   4.0%
3                     13110   3.9%
2                     12630   3.7%
Other values (282)    199942  59.1%

Minimum 5 values

Value                 Count   Frequency (%)
-56                   1       < 0.1%
-32                   1       < 0.1%
-28                   1       < 0.1%
-27                   1       < 0.1%
-26                   2       < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
9999                  2997    0.9%
836                   1       < 0.1%
759                   1       < 0.1%
710                   1       < 0.1%
703                   1       < 0.1%

DMTime
Real number (ℝ≥0)

Distinct: 14860
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15169.35331
Minimum: 5481
Maximum: 51393
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 5481
5-th percentile: 10922
Q1: 12197
median: 13818
Q3: 15690
95-th percentile: 23404
Maximum: 51393
Range: 45912
Interquartile range (IQR): 3493

Descriptive statistics

Standard deviation: 4906.767573
Coefficient of variation (CV): 0.3234658376
Kurtosis: 5.366214458
Mean: 15169.35331
Median Absolute Deviation (MAD): 1795
Skewness: 2.000914404
Sum: 5136221675
Variance: 24076368.02
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
11331                 167     < 0.1%
11320                 165     < 0.1%
11303                 163     < 0.1%
11321                 162     < 0.1%
11307                 161     < 0.1%
11329                 159     < 0.1%
11330                 159     < 0.1%
11315                 158     < 0.1%
11346                 156     < 0.1%
11350                 155     < 0.1%
Other values (14850)  336987  99.5%

Minimum 5 values

Value                 Count   Frequency (%)
5481                  1       < 0.1%
5484                  1       < 0.1%
5489                  2       < 0.1%
5490                  1       < 0.1%
5491                  1       < 0.1%

Maximum 5 values

Value                 Count   Frequency (%)
51393                 1       < 0.1%
51169                 1       < 0.1%
51136                 1       < 0.1%
51130                 1       < 0.1%
51007                 1       < 0.1%

DMGosaP
Real number (ℝ≥0)

ZEROS

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.75461913
Minimum: 0
Maximum: 100
Zeros: 90598
Zeros (%): 26.8%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 35
Q3: 79
95-th percentile: 99
Maximum: 100
Range: 100
Interquartile range (IQR): 79

Descriptive statistics

Standard deviation: 37.78880421
Coefficient of variation (CV): 0.9050209293
Kurtosis: -1.400703496
Mean: 41.75461913
Median Absolute Deviation (MAD): 35
Skewness: 0.3467443339
Sum: 14137780
Variance: 1427.993723
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     90598   26.8%
99                    15974   4.7%
100                   15686   4.6%
98                    14203   4.2%
97                    9844    2.9%
96                    6512    1.9%
95                    3977    1.2%
42                    2792    0.8%
34                    2684    0.8%
20                    2650    0.8%
Other values (91)     173672  51.3%

Minimum 5 values

Value                 Count   Frequency (%)
0                     90598   26.8%
1                     1999    0.6%
2                     1739    0.5%
3                     1666    0.5%
4                     1747    0.5%

Maximum 5 values

Value                 Count   Frequency (%)
100                   15686   4.6%
99                    15974   4.7%
98                    14203   4.2%
97                    9844    2.9%
96                    6512    1.9%

DMGosaM
Real number (ℝ≥0)

ZEROS

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.09591485
Minimum: 0
Maximum: 100
Zeros: 89136
Zeros (%): 26.3%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 17
Q3: 37
95-th percentile: 99
Maximum: 100
Range: 100
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 32.4381025
Coefficient of variation (CV): 1.154548719
Kurtosis: 0.2309561045
Mean: 28.09591485
Median Absolute Deviation (MAD): 17
Skewness: 1.243057452
Sum: 9513052
Variance: 1052.230494
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     89136   26.3%
99                    10802   3.2%
100                   9849    2.9%
98                    8969    2.6%
17                    7303    2.2%
14                    7227    2.1%
13                    7157    2.1%
15                    7028    2.1%
16                    7003    2.1%
18                    6531    1.9%
Other values (91)     177587  52.4%

Minimum 5 values

Value                 Count   Frequency (%)
0                     89136   26.3%
1                     1404    0.4%
2                     1976    0.6%
3                     2656    0.8%
4                     2706    0.8%

Maximum 5 values

Value                 Count   Frequency (%)
100                   9849    2.9%
99                    10802   3.2%
98                    8969    2.6%
97                    6051    1.8%
96                    3778    1.1%

DMJyuni
Real number (ℝ≥0)

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.01229208
Minimum: 0
Maximum: 18
Zeros: 1194
Zeros (%): 0.4%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 8
Q3: 12
95-th percentile: 16
Maximum: 18
Range: 18
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.505898074
Coefficient of variation (CV): 0.5623731672
Kurtosis: -1.027938731
Mean: 8.01229208
Median Absolute Deviation (MAD): 4
Skewness: 0.155387219
Sum: 2712898
Variance: 20.30311745
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)

Common values

Value                 Count   Frequency (%)
8                     23359   6.9%
7                     23328   6.9%
6                     23248   6.9%
5                     23180   6.8%
4                     22946   6.8%
9                     22842   6.7%
3                     22674   6.7%
2                     22626   6.7%
1                     22439   6.6%
10                    22369   6.6%
Other values (9)      109581  32.4%

Minimum 5 values

Value                 Count   Frequency (%)
0                     1194    0.4%
1                     22439   6.6%
2                     22626   6.7%
3                     22674   6.7%
4                     22946   6.8%

Maximum 5 values

Value                 Count   Frequency (%)
18                    2193    0.6%
17                    2745    0.8%
16                    12050   3.6%
15                    14608   4.3%
14                    16880   5.0%

KyakusituKubun
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 2
5th row: 3

Common values

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

Histogram of lengths of the category

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

Most occurring characters

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
3                     119181  35.2%
4                     108578  32.1%
2                     82502   24.4%
1                     25334   7.5%
0                     2997    0.9%

HenkanFlag1
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag2
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag3
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

Most occurring characters

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326750  96.5%
1                     11842   3.5%

HenkanFlag4
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag5
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag7
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag8
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanFlag9
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring characters

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     326123  96.3%
1                     12469   3.7%

HenkanUma1
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 338592
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common values

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Histogram of lengths of the category

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Most occurring characters

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        338592  100.0%

Most frequent character per category

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Most occurring scripts

Value                 Count   Frequency (%)
Common                338592  100.0%

Most frequent character per script

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 338592  100.0%

Most frequent character per block

Value                 Count   Frequency (%)
0                     337566  99.7%
1                     1026    0.3%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
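
The definition above translates directly into code. The following is a minimal pure-Python sketch for illustration, not the routine the profiling tool calls; the function name `pearson_r` and the toy inputs are made up.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their std devs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the two variances cancel,
    # so plain sums are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable leaves r unchanged.
print(pearson_r([10 * x + 3 for x in xs], ys))
```

The function fails with a division by zero when either input is constant, which mirrors the fact that r is undefined in that case.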

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
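
As a sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Giving tied values the mean of their ranks is a common convention and is assumed here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: rho reaches 1, Pearson's r does not.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```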

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
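
The pair-counting definition can be sketched as below. This is the plain variant often called tau-a; statistical libraries typically report tau-b, which adds a correction for ties, so results can differ on tied data. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
        # sign == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The loop visits every pair, so the cost grows quadratically with the number of observations. For a table of this size (338592 rows) an O(n log n) algorithm would be needed in practice.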

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
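
A minimal sketch of a bias-corrected Cramér's V follows. It uses the correction commonly attributed to Bergsma (2013), with φ² reduced by (k-1)(r-1)/(n-1) and the row and column counts shrunk accordingly. The function name and the toy contingency tables are hypothetical, and this may not match the tool's own code line for line.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    r = len(table)       # number of rows
    k = len(table[0])    # number of columns
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


independent = [[50, 50], [50, 50]]   # no association
identical = [[500, 0], [0, 500]]     # two flags that always agree
print(cramers_v_corrected(independent), cramers_v_corrected(identical))
```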

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
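
The per-column nullity behind these displays amounts to counting missing entries in each column. The sketch below does this for a list of dict records; the records and the helper name `null_counts` are made up for illustration, and in this dataset every count would be 0.

```python
def null_counts(rows, columns):
    """Count missing (None) entries per column in a list of dict records."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            if row.get(column) is None:
                counts[column] += 1
    return counts


# Hypothetical records, using two column names from this report.
rows = [
    {"Honsyokin": 0, "TimeDiff": 4},
    {"Honsyokin": None, "TimeDiff": 12},
    {"Honsyokin": 14800, "TimeDiff": None},
]
print(null_counts(rows, ["Honsyokin", "TimeDiff"]))
```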

Sample

First rows

Honsyokin Fukasyokin HaronTimeL3_x KettoNum1 KettoNum2 TimeDiff DMTime DMGosaP DMGosaM DMJyuni KyakusituKubun HenkanFlag1 HenkanFlag2 HenkanFlag3 HenkanFlag4 HenkanFlag5 HenkanFlag7 HenkanFlag8 HenkanFlag9 HenkanUma1
00035820071016470411011351162000000000
10036220061026180121103124593000000000
21480003622006103886091108325682000000000
3003652006106132071087500102000000000
4370005103512007101647031102019383000000000
52600071035120061051740211061342253000000000
6003522007110035051103901914000000000
771000033920051048760-8110420041000000000
8210000347200510023602110410073000000000
9280000353200610271700109630034111111110

Last rows

Honsyokin Fukasyokin HaronTimeL3_x KettoNum1 KettoNum2 TimeDiff DMTime DMGosaP DMGosaM DMJyuni KyakusituKubun HenkanFlag1 HenkanFlag2 HenkanFlag3 HenkanFlag4 HenkanFlag5 HenkanFlag7 HenkanFlag8 HenkanFlag9 HenkanUma1
3385823500057034620181091340610977972653000000000
33858300383201810913404010984972663000000000
3385840036120171060870161101300163000000000
33858500368201710392701420288100100174000000000
33858611000036020171049300559583631112000000000
3385870035720161044230181103610049114000000000
3385880036220151047930141473500102000000000
338589003802018105909059110666131154000000000
338590003622017104598062028010035134000000000
338591003812018104883038151029999144000000000